Circumventing Negative Transfer via Cross Generative Initialisation

Authors

  • Wenjun Bai
  • Changqin Quan
  • Zhi-Wei Luo
Abstract

Negative transfer, a special case of transfer learning, refers to the interference of previously acquired knowledge with new learning. In this research, we demonstrate empirically that conventional neural-network-based transfer techniques, i.e., mid-level feature extraction and knowledge distillation, offer no effective defence against negative transfer. Under a finer specification of transfer learning, we speculate that the real culprits of negative transfer are the incongruence between task and model complexity and the ordering of learning. Based on this speculation, we propose a tentative transfer learning technique, cross generative initialisation, to sidestep negative transfer. The effectiveness of cross generative initialisation was evaluated empirically.

1 UNAVAILING KNOWLEDGE TRANSFER IN NEGATIVE TRANSFER

In inductive transfer learning (Pan & Yang, 2010), learning a target task benefits from knowledge transferred from a previously learned task, i.e., a source task. In neural-network-based transfer learning, extracted mid-level features serve as the knowledge to transfer; these transferable features are produced by running a forward pass through a trained neural network (Oquab et al., 2014). To optimise the usage of extracted features, Hinton et al. (2015) proposed a knowledge distillation technique that allows a cumbersome model to be compressed into a compact one: the distilled knowledge, i.e., a cross-entropy loss computed against the outputs of the learned (teacher) network, augments the fine-tuning of the to-be-learned network. However, both techniques are futile in a special case of transfer learning: negative transfer. Negative transfer, a term borrowed from cognitive science, occurs when the prior learning of a source task interferes with the later learning of a target task (Pan & Yang, 2010). To formalise the discussion, consider two sequentially learned tasks of differing complexity, e.g., T1 (a complex task) and T2 (a simple task), and assign two corresponding models, M1 (a cumbersome model) and M2 (a compact model), to learn them. Depending on the congruence between task and model complexity and on the learning sequence, there are four distinct transfer learning cases: T1M1 → T2M2; T1M2 → T2M1; T2M1 → T1M2; T2M2 → T1M1. As demonstrated in Table 1, negative transfer can only be circumvented when the congruence between model and task complexity is high and the ordering of learning proceeds from complex to simple, i.e., T1M1 → T2M2. In the other three cases, both the mid-level feature extraction and the knowledge distillation techniques fail to sidestep negative transfer. (→ denotes the direction of knowledge transfer; e.g., T1M1 → T2M2 means that knowledge extracted from the prior learning of a complex task by a cumbersome model is transferred to assist the learning of a simple task by a compact model.)
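To make the two baseline techniques concrete, the sketch below, assuming PyTorch, illustrates mid-level feature extraction (taking intermediate activations from a trained source-task network) and a standard knowledge-distillation loss in the spirit of Hinton et al. (2015). The layer index, temperature, and loss weighting are illustrative assumptions, not the configuration used in the paper.

# A minimal sketch, assuming PyTorch, of the two baseline transfer techniques
# discussed above. Layer index, temperature, and weighting are illustrative
# assumptions, not the authors' exact setup.
import torch
import torch.nn as nn
import torch.nn.functional as F

def extract_mid_level_features(trained_model: nn.Sequential,
                               x: torch.Tensor,
                               layer_idx: int) -> torch.Tensor:
    # Forward-propagate the input through the first `layer_idx` layers of a
    # trained source-task network and return the intermediate activations as
    # transferable mid-level features (cf. Oquab et al., 2014).
    features = x
    with torch.no_grad():
        for layer in list(trained_model.children())[:layer_idx]:
            features = layer(features)
    return features

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      targets: torch.Tensor,
                      temperature: float = 4.0,
                      alpha: float = 0.5) -> torch.Tensor:
    # Knowledge-distillation objective in the spirit of Hinton et al. (2015):
    # a weighted sum of the hard-label cross entropy and the divergence from
    # the teacher's temperature-softened output distribution.
    hard = F.cross_entropy(student_logits, targets)
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)
    return alpha * hard + (1.0 - alpha) * soft

In the paper's notation, the case T1M1 → T2M2 would correspond to extracting such features (or teacher logits) from the cumbersome model M1 trained on the complex task T1, then using them while training the compact model M2 on the simple task T2.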



Journal:

Volume   Issue

Pages  -

Publication year: 2018